ABSTRACT


This thesis is a simulation study of the characteris- 
tic roots of the sample covariance matrix. The sample co- 
variance matrices used in the simulation are computed from 
normally distributed p-variate vectors. The vectors are 
generated from N(0,1) variates and the population covariance 
matrix with known eigenvalues. The eigenvalues of the popu- 
lation covariance matrix are arbitrarily chosen. Four non- 
zero eigenvalues are considered in each case. 

The empirical cumulative distribution of each of the 
non-zero eigenvalues is considered. 

The results show that the distributions of the non-zero
equal roots show substantial variability. This variability
is a function of both the number of equal roots and the rank
order of the root. The results for the distinct roots are
fairly consistent.

CHAPTER 1


SOME TESTS BASED ON THE ROOTS OF COVARIANCE 
MATRICES IN MULTIVARIATE ANALYSIS 


1.1 Introduction 

Multivariate analysis may be defined as the branch
of statistical analysis which is concerned with the relationship
of sets of variates, that is, the study of vectors of
random variables whose components may be correlated with one
another.

The main reason for applying multivariate analysis 
is to solve problems and arrive at numerical results which 
can be used as the basis for decision making. 

The general procedure is as follows: 

(i) A hypothesis is proposed.
(ii) Multivariate data are collected.

(iii) A model for the sampling distribution is proposed
on the basis of (usually) mathematical
results.

(iv) The data are examined and the hypothesis is tested.

The majority of tests of significance used for testing
hypotheses in multivariate analysis are based on the hypothetical
sampling distribution of the characteristic roots
(eigenvalues) of the multivariate analysis of variance
(MANOVA) matrices and the sample covariance matrices.

The tests were derived on the assumption that the
random samples are drawn from one or more p-variate normal
populations.

Many tests have been formulated for the MANOVA matrices
and the joint distribution of the non-zero characteristic
roots of these matrices in the null case is given by

where m, n are interpreted differently for different test
situations.


1.2 Tests Involving the Roots of the Single p-variate Matrix 


The sample covariance matrix S is given by

\[ S = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})(x_i - \bar{x})' \]

where N is the sample size and

\[ \bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i . \]

S is an unbiased estimate of the population covariance matrix. Let

\[ A = (N-1)S . \]


The distribution of A (or S) is called the Wishart 
distribution and is the multivariate generalization of the 
univariate gamma distribution and therefore plays an import- 
ant part in statistical inference. 


The density function of A for A positive definite is

where C is the population covariance matrix.


The joint distribution of the roots of A for C=I is

where $\lambda_1 > \lambda_2 > \cdots > \lambda_p > 0$ are the characteristic roots of A.

Lawley [10] derived tests of significance for the
characteristic roots of the sample covariance matrix. He
also expressed the expected value [$E(\ell_r)$] and the variance
of the sample roots in terms of the population roots.


Lawley's Tests

Let $\ell_r$ (r = 1, 2, ..., k) be one of the k distinct roots of
the sample.

(i) If the first k roots of the population covariance
matrix are distinct, and the remaining p-k roots are equal
to $\delta$ say, to test the hypothesis of equality of the p-k
roots, the test statistic is given by

\[ \text{Const}\,\{-\log(\ell_{k+1} \cdots \ell_p / \delta^{p-k}) + \cdots\} \]

which is approximately $\chi^2$ with $\frac{1}{2}(p-k)(p-k+1)$ degrees of freedom,

where q = p-k.

(ii) If $\delta$ is unknown the approximate $\chi^2$ has $\frac{1}{2}(p-k-1)(p-k+2)$
degrees of freedom and the test of the hypothesis
becomes

\[ \text{Const}\,\{-\log(\ell_{k+1} \cdots \ell_p) + (p-k)\log[(\ell_{k+1} + \cdots + \ell_p)/(p-k)]\} \]

where

\[ \text{Const} = n - k - \frac{1}{6}\left(2q + 1 + \frac{2}{q}\right) + \delta^2 \sum_{r=1}^{k} \frac{1}{(\lambda_r - \delta)^2} . \]


When k=0 this reduces to the hypothesis of equality of the
roots and Const is then given by

and the variance of $\ell_r$ is

where O(f(n)) signifies an expression of order not exceeding
f(n).

In 1.10 and 1.11, n is the sample size (number of
vectors); if n is large, the terms $O(n^{-2})$ and $O(n^{-3})$ are insignificant.

The above expressions are valid only if the roots are
distinct; there are no available expressions for the non-zero
equal roots.

It follows from 1.10 and 1.11 that the mean and vari- 
ance and hence the distribution of each of the non-zero 
distinct roots depend on the other characteristic roots, but 
the extent of the dependence cannot be exactly determined 
from 1.10 and 1.11 as these expressions are not exact. 
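
The dependence on the other roots can be made concrete by evaluating the leading terms numerically. The sketch below assumes the usual form of Lawley's approximations, E(l_r) ~ lambda_r [1 + (1/n) sum_{i != r} lambda_i / (lambda_r - lambda_i)] and Var(l_r) ~ (2 lambda_r^2 / n) [1 - (1/n) sum_{i != r} (lambda_i / (lambda_r - lambda_i))^2], with the higher-order terms dropped. The function name `lawley_moments` and the choice n = N - 1 = 99 are assumptions made for illustration; the population roots are 4, 3, 2, 1 followed by six zeros (p = 10).

```python
def lawley_moments(pop_roots, r, n):
    """Leading terms of E(l_r) and Var(l_r) for a distinct population root.

    pop_roots: all p population roots (zeros included)
    r:         0-based index of the root of interest (must be distinct)
    n:         taken here as N - 1 (an assumption of this sketch)
    """
    lam_r = pop_roots[r]
    # lambda_i / (lambda_r - lambda_i) for every other root; zero roots contribute 0
    ratios = [lam / (lam_r - lam) for i, lam in enumerate(pop_roots) if i != r]
    mean = lam_r * (1.0 + sum(ratios) / n)
    var = (2.0 * lam_r ** 2 / n) * (1.0 - sum(t * t for t in ratios) / n)
    return mean, var


roots = [4.0, 3.0, 2.0, 1.0] + [0.0] * 6
mean2, var2 = lawley_moments(roots, 1, 99)   # second root, population value 3
print(round(mean2, 4), round(var2, 4))       # 2.9545 0.1446
```

The division by (lambda_r - lambda_i) is what makes the expressions unusable for equal roots: the function raises a ZeroDivisionError if the chosen root coincides with another one.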

To construct exact significance tests, and to be aware 
of the magnitude of possible sampling errors it is important 
to know the sampling distributions of the estimates of these 
roots. 

In this simulation study the sample chosen for the study 
of the empirical distributions of the sample roots is the 
single sample p-variate covariance matrix. 

The following points are considered in this thesis:

(1) The empirical cumulative distribution of each of
    the non-zero characteristic roots (population
    roots known):
    (i) When the non-zero roots are distinct.
    (ii) When the non-zero roots contain equal roots.

(2) A comparison of the distributions of some of the
    roots from different computer runs.

(3) The general behaviour of the non-zero equal roots
    and the dependence of their distribution on the
    rank order of the roots.

CHAPTER 2


GENERATING AND TESTING THE SIMULATED MODEL 


2.1 The Maximum Likelihood Estimates of C and $\mu$

The model generated is based on the theory that the 
p-variate vectors used to compute the sample covariance 
matrix S are distributed N(0,C) where C, the population co- 
variance matrix is known. 


The density function of the multivariate normal is

\[ f(X) = \frac{1}{(2\pi)^{p/2} |C|^{1/2}} \exp\left[-\tfrac{1}{2}(X-\mu)' C^{-1} (X-\mu)\right] \]

where

\[ X = (X_1, X_2, \ldots, X_p)' \]

and $\mu$ is a vector of means.


The maximum likelihood estimates of C and $\mu$ are given by

\[ \hat{C} = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})(x_i - \bar{x})' , \qquad \hat{\mu} = \bar{x} . \]

$S = \frac{1}{N-1} A$ is the unbiased estimate of C.


2.2 Computing a Covariance Matrix with Known Eigenvalues 


For every real symmetric matrix B, there exists an
orthogonal matrix P such that

\[ P'BP = D \]

where D is the diagonal matrix of the eigenvalues of B.
If C is the covariance matrix (real and symmetric) with
eigenvalues

\[ \lambda_1, \lambda_2, \ldots, \lambda_p \]

then

\[ P'CP = \Lambda \]

where $\Lambda$ is the diagonal matrix of eigenvalues $\lambda_1, \ldots, \lambda_p$.
Premultiplying by P and post-multiplying by P' (since P
is orthogonal)

\[
\begin{aligned}
C &= P \Lambda P' \quad \text{(C is a symmetric matrix)} \\
  &= P(\Lambda^{1/2}\Lambda^{1/2})P' \\
  &= (P\Lambda^{1/2})(\Lambda^{1/2}P')
\end{aligned}
\]

where $\Lambda^{1/2}$ is the diagonal matrix of the square roots of the
eigenvalues of C.


2.3 Computing P the Orthogonal Matrix Used in the Sample 


An arbitrary matrix $A_0$ was chosen, such that

0 0.0 0.0 0.0

so that $A_0 A_0'$ is a symmetric matrix.


The eigenvalues and eigenvectors of $A_0 A_0'$ were then
calculated. The eigenvectors calculated were all orthogonal
and these p eigenvectors formed the orthogonal matrix P.
Having found P, the covariance matrix C was then computed
using 2.3 and 2.4.
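
The construction can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the program used in the study: the arbitrary matrix (`A0` here) is drawn at random because its entries are not legible in this copy, the seed and variable names are hypothetical, and the eigenvalues 5, 2, 2, 1 with six zero roots are one admissible choice for p = 10.

```python
import numpy as np

rng = np.random.default_rng(0)           # hypothetical seed
p = 10
eigenvalues = np.array([5.0, 2.0, 2.0, 1.0] + [0.0] * (p - 4))

A0 = rng.standard_normal((p, p))         # stand-in for the arbitrary matrix A_0
_, P = np.linalg.eigh(A0 @ A0.T)         # columns of P: orthonormal eigenvectors of A_0 A_0'

half = P @ np.diag(np.sqrt(eigenvalues)) # P Lambda^(1/2)
C = half @ half.T                        # C = (P Lambda^(1/2)) (P Lambda^(1/2))'

# The roots of C should be the chosen eigenvalues, up to rounding error.
recovered = np.sort(np.linalg.eigvalsh(C))[::-1]
print(np.round(recovered[:4], 6))
```

Any orthogonal P gives a matrix C with the same roots, so the particular choice of the arbitrary matrix affects the entries of C but not its eigenvalues.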

In the first case the eigenvalues of C were chosen
to be 5, 2, 2, 1 and $\lambda_i = 0$ for i = 5, ..., p, and

\[ C = (P\Lambda^{1/2})(P\Lambda^{1/2})' \]


2.4 Generating the Normal Random Vectors N(0,C)


From 2.3 we have

\[ C = A_1 A_1' , \qquad A_1 = P\Lambda^{1/2} . \]

Let Y be distributed N(0,I) where I (p x p) is the identity
matrix, and let

\[ X = A_1 Y . \]

Then X is distributed $N(0, A_1 A_1')$, that is

\[ X_i = \sum_{j=1}^{p} a_{ij} Y_j \]

where $a_{ij}$ is the ij element of $A_1$ and $Y_1, \ldots, Y_p$ are p
independent standard normal variables N(0,1).
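
A short NumPy sketch of this transformation follows. For brevity it assumes P = I, so that the factor of C is simply the diagonal matrix of root square roots, and it uses a small p and a very large N (both hypothetical choices) so that the sample covariance matrix can be checked against the population matrix.

```python
import numpy as np

rng = np.random.default_rng(1)                 # hypothetical seed
p, N = 4, 100_000                              # large N only to make the check tight
A1 = np.diag(np.sqrt([5.0, 2.0, 2.0, 1.0]))    # illustrative factor with P taken as I

Y = rng.standard_normal((p, N))                # columns are independent N(0, I) vectors
X = A1 @ Y                                     # columns are N(0, A1 A1') vectors

Xc = X - X.mean(axis=1, keepdims=True)         # subtract the sample mean vector
S = Xc @ Xc.T / (N - 1)                        # unbiased sample covariance matrix
print(np.round(S, 2))                          # close to diag(5, 2, 2, 1)
```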


2.5 Method of Generating the Random Numbers N(0,1) 


The algorithm formulated by Chen [6] was used to gene- 
rate the N(0,1) numbers. 

This algorithm generates pseudo-random numbers for a 
32-bit word computer, for example the IBM 360/67 which was 
used in the simulation. 

The theory is based on the multiplicative congruential
method given by

\[ R_i = R_{i-1}(2^p + k) \pmod{2^{31}} \]

where p is a positive integer greater than 2 and less than
31, and k is any odd integer.


The random deviates of the unit uniform distribution
are obtained by putting

\[ U_i = R_i / 2^{31} . \]
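
The recurrence can be sketched directly in Python. This is a plain single-stream version, not Chen's dual generator: the function name, the seed, and the scaling of each R_i by the modulus 2^31 are assumptions of the sketch, and the defaults p = 18, k = 3 are only one admissible choice.

```python
def congruential_uniforms(seed, count, p=18, k=3, modulus=2 ** 31):
    """Return `count` uniform deviates from R_i = R_{i-1} * (2**p + k) mod modulus.

    The seed should be odd: an odd seed times an odd multiplier stays odd,
    so the sequence can never collapse to zero.
    """
    multiplier = 2 ** p + k
    r = seed
    deviates = []
    for _ in range(count):
        r = (r * multiplier) % modulus
        deviates.append(r / modulus)     # scale the integer into (0, 1)
    return deviates


u = congruential_uniforms(seed=12345, count=5)
print(u)
```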


Finally, two independent random normal deviates of mean 0
and variance 1 are produced from two independent uniform
deviates $U_1$ and $U_2$ by the transformation suggested by Box
and Muller [5], that is

\[ X_1 = (-2 \log_e U_1)^{1/2} \cos 2\pi U_2 \tag{2.8} \]

\[ X_2 = (-2 \log_e U_1)^{1/2} \sin 2\pi U_2 \tag{2.9} \]
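
The Box and Muller transformation is short enough to show in full. In the sketch below Python's built-in uniform generator stands in for Chen's (an assumption made so that the example is self-contained), and the seed is arbitrary.

```python
import math
import random


def box_muller(u1, u2):
    """Map two independent uniforms on (0, 1) to two independent N(0, 1) deviates."""
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


random.seed(0)  # arbitrary seed; random.random() replaces Chen's generator here
normals = []
for _ in range(5000):
    # 1 - random() lies in (0, 1], which keeps log() away from zero
    normals.extend(box_muller(1.0 - random.random(), random.random()))

mean = sum(normals) / len(normals)
variance = sum((x - mean) ** 2 for x in normals) / (len(normals) - 1)
print(round(mean, 3), round(variance, 3))
```

Each call consumes two uniform deviates and returns two normal deviates, which is why generating the uniforms in pairs is convenient.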


Chen chose to generate two random uniform deviates
at a time rather than one. Using the multiplicative congruential
method this becomes

2.10

Each generation of this dual type produces alternate
numbers for the sequence.


It was found that generators with k=3 perform better
than those with k=1. $(2^p + k)$ is relatively prime to $2^{31}$
and is an odd number, therefore k is an odd number. Chen used a
value of k=3 in his generator.

Values of $p_1 = 14$ and $p_2 = 18$ were found empirically by
Chen to be the best among all possible combinations of
$p_1, p_2 \le 18$. He further improved the generator by changing
2.11 to give

\[ R_{2i} = [R_{2i-1}(2^{18}+3)](2^{18}+3) \pmod{2^{31}} \]

He tested the generator on a sample of random numbers, and the
results of the tests performed seem satisfactory
for the purpose of this simulation.

10,000 random N(0,1) numbers were generated using 
Chen's algorithm for each of five different sets of starting 
values. The Kolmogorov-Smirnov test statistic was then 
used to test the goodness of fit of the sample to the N(0,1) 
distribution. Each sample set was divided into 162 intervals 
from -4 to 4. The maximum deviation from among all the sets 
was 0.008, which is in agreement with the results obtained 
by Chen. 

At the 5 percent level of significance the critical 
value of the Kolmogorov-Smirnov test is 0.0136. 

The hypothesis that the numbers generated are N(0,1)
was therefore accepted at the five percent level of
significance.
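
The quoted critical value follows from the large-sample approximation 1.36 / sqrt(n) for the five percent level. The sketch below computes the ungrouped Kolmogorov-Smirnov statistic rather than the grouped version over 162 intervals described above, and it uses Python's own normal generator with an arbitrary seed; both are simplifications assumed for illustration.

```python
import math
import random


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF of the sample and `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # the empirical CDF jumps from i/n to (i+1)/n at x
        gap = max(gap, (i + 1) / n - f, f - i / n)
    return gap


def ks_critical_5pct(n):
    """Large-sample five percent critical value, 1.36 / sqrt(n)."""
    return 1.36 / math.sqrt(n)


random.seed(1)  # arbitrary seed
sample = [random.gauss(0.0, 1.0) for _ in range(10000)]
d = ks_statistic(sample, normal_cdf)
print(round(ks_critical_5pct(10000), 4))   # 0.0136
```

The hypothesis of normality is retained when `d` falls below the critical value.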
A matrix Y (p x N) of N(0,1) variates was generated with
p=10 and N=100

\[ Y = \begin{pmatrix} y_{11} & y_{12} & \cdots & y_{1N} \\ y_{21} & y_{22} & \cdots & y_{2N} \\ \vdots & & & \vdots \\ y_{p1} & y_{p2} & \cdots & y_{pN} \end{pmatrix} \]

and using $A_1$ from 2.4, the matrix of random vectors was
computed from

\[ X = A_1 Y \]

and putting $X_c = (X - \bar{x})$, S was then computed as

\[ S = \frac{1}{N-1} X_c X_c' . \]


Testing the Goodness of Fit of the Model 


1,000 samples of S were generated from a population
covariance matrix with four non-zero characteristic roots,
and for each run the test statistic given by

\[ W = \left[n - \tfrac{1}{6}\left(2p + 1 - \frac{2}{p+1}\right)\right]\left[\log\frac{|C|}{|S|} + \operatorname{tr}(C^{-1}S) - p\right] \tag{2.15} \]

(which is approximately $\chi^2$ with $\frac{1}{2}p(p+1)$ degrees of freedom)
was calculated.

In order to compute this statistic some assumptions 
had to be made, since the zero roots in the population co- 
variance matrix C would render the statistic indeterminate. 
In all the goodness of fit tests that have been derived so 
far, the test statistic is dependent on the product of the 
population roots or estimates of them. 

In factor analytic methods, after the appropriate 
number of factors have been fitted, the remaining factors 
tend to be zero. This being the case 107° was considered a 
reasonable approximation for the zero roots. 

The 1,000 values of the test statistic were divided
into 54 intervals from 28.0 to 97.0 and the Kolmogorov-Smirnov
test was used to test the goodness of fit with a $\chi^2$
distribution with 55 degrees of freedom. The maximum difference
was 0.031; at the five percent level of significance
the critical value of the test is 0.043. Since the maximum
deviation is below this value, the hypothesis that the sample
covariance matrix S is a good representation of the population
covariance matrix C was accepted at the five percent
level of significance.

CHAPTER 3
RESULTS 
Five computer runs were made, each with a population
covariance matrix having four non-zero roots. The non-zero
characteristic roots of the population matrix were arbitrarily
chosen as follows:


TABLE 1

INPUT DATA

              Value of the Population Roots
Rank Order
of Root       Run 1    Run 2    Run 3    Run 4    Run 5
   1            5        5        4        7        4
   2            2        3        3        1        2
   3            2        1        2        1        2
   4            1        1        1        1        2


Since various test procedures require that the characteristic
roots be in descending order of magnitude, the
roots were arranged in this order.

1,000 samples were generated for each run with N=100,
p=10, $N_s$=1000.



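The sample mean, variance and skewness were calculated
for each root using

\[ \text{Sample mean} \quad \bar{\ell} = \frac{1}{N_s} \sum_{j=1}^{N_s} \ell_j \]

where $\ell_j$ is a root of the sample,

\[ \text{Sample variance} \quad s^2 = \frac{1}{N_s - 1} \sum_{j=1}^{N_s} (\ell_j - \bar{\ell})^2 \]

and

\[ \text{Skewness} = \frac{\sum_{j=1}^{N_s} (\ell_j - \bar{\ell})^3}{N_s s^3} . \]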

$E(\ell_r)$ and $Var(\ell_r)$ were also calculated using 1.10 and
1.11. Since the sample size used was N=100, $1/n^2$ and $1/n^3$
are of the order $10^{-4}$ and $10^{-6}$ respectively; these values
are insignificant compared with the other values. These terms
were therefore omitted in the calculations.

Since the matrix is symmetric only the lower triangular
portion is given. The values are rounded to three
decimal places for each computer run.


TABLE 2


MEAN, VARIANCE AND SKEWNESS OF THE
NON-ZERO ROOTS OF RUN 1


Pop. Value   Sample    Sample
of Root      Mean      Variance   $E(\ell_r)$   $Var(\ell_r)$   Skewness
5.0          5.0805    0.5063     5.0800        0.495           [illegible]
2.0          2.22      0.0676     --            --              0.4657
2.0          1.7256    0.0454     --            --              0.0914
1.0          0.9466    0.0176     0.9475        0.0179          0.1711


The expressions for $E(\ell_r)$ and $Var(\ell_r)$ are not applicable
when the non-zero roots are equal.

Figs. 1.1, 1.2, 1.3 and 1.4 show the cumulative distribution
of each of the non-zero roots.

Fig. 1.5 shows the distribution of roots 2 and 3 when
the population root = 2.0.

TABLE 3


MEAN, VARIANCE AND SKEWNESS OF THE
NON-ZERO ROOTS OF RUN 2


             Sample    Sample
Pop. Root    Mean      Variance   $E(\ell_r)$   $Var(\ell_r)$   Skewness
5            5.1013    0.5042     5.1010        0.488           0.2017
3            2.9170    0.1641     2.954         0.169           0.3287
1            1.0929    0.0154     --            --              0.4710
1            0.8483    0.0122     --            --              0.1441


Figs. 2.1, 2.2, 2.3 and 2.4 show the cumulative distribution
of each of these roots.

Fig. 2.5 shows the cumulative distribution of roots 3 and 4.

TABLE 4


MEAN, VARIANCE AND SKEWNESS OF THE
NON-ZERO ROOTS OF RUN 3


             Sample        Sample
Pop. Root    Mean          Variance   $E(\ell_r)$   $Var(\ell_r)$   Skewness
4            [illegible]   0.3011     4.1740        0.2904          0.2756
3            2.9189        0.1336     2.9545        0.1446          [illegible]
2            1.9029        0.0654     1.9192        0.0693          0.1939
1            0.951         0.0180     0.9512        0.0186          0.2053

Figs. 3.1, 3.2, 3.3 and 3.4 show the cumulative distribution
of each of these roots.

TABLE 5


MEAN, VARIANCE AND SKEWNESS OF THE
NON-ZERO ROOTS OF RUN 4


             Sample    Sample
Pop. Root    Mean      Variance   $E(\ell_r)$   $Var(\ell_r)$   Skewness
7            7.0351    1.0011     7.0393        0.9898          0.1549
1            1.1949    0.0168     --            --              0.6369
1            0.9732    0.0089     --            --              0.1792
1            0.7820    0.0091     --            --              0.1060


Figs. 4.1, 4.2, 4.3 and 4.4 show the cumulative distribution
of these roots.

Fig. 4.5 shows the cumulative distribution of roots 2,
3 and 4.

TABLE 6


MEAN, VARIANCE AND SKEWNESS OF THE
NON-ZERO ROOTS OF RUN 5


             Sample    Sample
Pop. Root    Mean      Variance   $E(\ell_r)$   $Var(\ell_r)$   Skewness
4            4.1244    0.3144     4.1212        [illegible]     0.2054

CHAPTER 4


CONCLUSIONS 


4.1 Discussion of Results 

The values of $E(\ell_r)$ and $Var(\ell_r)$ for the k distinct
non-zero roots using the expression given by Lawley were
consistent with the values obtained using the sample mean
and sample variance. The maximum discrepancy occurred for
root 2 of run 3, where the mean calculated from the
sample is 2.9189 and that calculated from Lawley's expression
is 2.9545, giving a difference of 0.0356 from the sample, or
a discrepancy of 1.2 percent. The variance in this case is
0.1336 from the sample and 0.1446 from Lawley's expression.
The difference here is 0.011, or 7.9 percent greater than the
sample value.

Lawley's expression is applicable for the k distinct
roots. As seen from root 4 of run 1, however, not all of the
non-zero roots need be distinct: the expression is also applicable
whenever the non-zero root considered is distinct from the other
roots.

The equal roots show substantial variability in mean 
and skewness. This variability may be a function both of the 
size of the root and the number of equal roots in the popu- 
lation. 


From run 4 where the three population roots are equal
and the population value = 1.0, the sample roots range from
1.19 to 0.78, and in run 5 where the population value is 2.0,
the values of the sample roots range from 2.37 to 1.55. For
the case of two equal roots, when the population root = 2.0,
the sample roots are 2.22 and 1.73, and for a population root
of 1.0, the sample roots are 1.09 and 0.85 respectively.
(The mean value of the sample root is used in each case.)

Figs. 4.5, 5.5, 1.5 and 2.5 show the variability in
the cumulative distribution of these roots.

This variability is also very marked in the skewness
of the distribution of the equal roots. The first of the
equal roots is strongly positively skewed, and there is a sharp
drop from the first to the second. This is noticeable in all
cases, whether there are two equal roots or three equal roots.
In the case of three equal roots, the difference in the skewness
of the second and third roots is less substantial. The
skewness decreases with the rank order of the root; this is
also the case with the sample means of the equal roots.

Since the number of equal roots is small, no generalizations
can be made concerning the variability of the sample
roots. No obvious relationship can be seen between the sample
roots and the parent population root; however, there seems
to be some relationship among the roots themselves.

In run 4 where the value of the population root is 1.0,
the values of the sample roots are 1.19, 0.97 and 0.79. The
second root is 81.5 percent of the first and the third is
81.4 percent of the second. In run 5 where the value of the
population root is 2.0, the sample roots are 2.365, 1.93 and
1.55 respectively. The second root is 81.6 percent of the
first and the third root is 80.3 percent of the second, so
that the relationship among the roots for the case of three
equal population roots is fairly consistent.

In runs 1 and 2 where there are two equal roots in the 
population matrix, the second sample root is 78 percent of 
the first in both cases. 
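
These percentages are simple ratios of successive sample means and can be checked directly. The sketch below uses the rounded means quoted in this section (1.19, 0.97, 0.79 for run 4; 2.365, 1.93, 1.55 for run 5; 2.22, 1.73 for run 1; 1.09, 0.85 for run 2); the function name is hypothetical.

```python
sample_means = {
    "run 4 (three equal roots, population value 1.0)": [1.19, 0.97, 0.79],
    "run 5 (three equal roots, population value 2.0)": [2.365, 1.93, 1.55],
    "run 1 (two equal roots, population value 2.0)": [2.22, 1.73],
    "run 2 (two equal roots, population value 1.0)": [1.09, 0.85],
}


def successive_ratios(values):
    """Each root as a percentage of the root ranked just above it."""
    return [round(100.0 * b / a, 1) for a, b in zip(values, values[1:])]


for label, values in sample_means.items():
    print(label, successive_ratios(values))
```

With the rounded means the two-root cases give 77.9 and 78.0 percent, and the three-root cases give between 80.3 and 81.6 percent.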

The relationship seems to be independent of the size 
of the population root but is dependent on the number of 
equal roots. The value of the sample root which corresponds 
to one of the equal roots of a population covariance matrix 
is therefore a function of the value of the population root, 
the rank order of the root among the equal roots, and the 
number of equal roots. 

In order to establish some concrete results concerning 
the relationship among the sample roots when the population 
roots are equal, it would be necessary to obtain many samples 
for different numbers of equal roots in the population matrix. 

In most tests where the sample covariance matrix is
used, the population covariance matrix is usually unknown.
Since the equality of the roots in the population is not evident
in the sample roots, tests which are functions of a
single root of the sample may have somewhat misleading results,
unless a test is first made to determine the presence of
non-zero equal roots in the population covariance matrix.


4.2 Comparison of Some Distributions


The Kolmogorov-Smirnov two sample test was used to 
test whether sample roots corresponding to population roots 
of the same numerical value and rank order, but where the 
remaining roots varied, were drawn from the same population 


or from populations with the same distribution. 


TABLE 7


COMPARISON OF SOME DISTRIBUTIONS


                            Rank Order
Pop. Value   Run   Run      of Root       Max. Dev.
5.0          1     2        1             0.019
4.0          3     5        1             0.049
3.0          2     3        2             0.035
1.0          1     3        4             0.018
1.0          1     2        4             [illegible]
2.0          1     5        2             [illegible]
2.0          1     3        3             [illegible]
1.0          2     4        [illegible]   [illegible]
1.0          2     3        4             0.346


The critical value of the Kolmogorov-Smirnov test at
the five percent level of significance is 0.061. In the
first four cases the maximum deviation is below this value
and the hypothesis that these roots are from populations with
the same distribution is accepted at the five percent level
of significance.

The deviation in each of the remaining five cases is
very significant, and the hypothesis of equal populations
is therefore rejected at the five percent level of
significance.

The results show that the roots which are distinct from
all other non-zero roots in the population have the same
distributions, while the non-distinct roots are from
populations with significantly different distributions.
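
The two-sample comparison can be sketched without any library support. The critical value below uses the large-sample approximation 1.36 sqrt((n1 + n2) / (n1 n2)) for the five percent level, which gives 0.061 for two sets of 1,000 roots; the function names are hypothetical and the statistic is computed on the raw values rather than on grouped data.

```python
import math


def ks_two_sample(a, b):
    """Maximum distance between the empirical CDFs of samples a and b."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # advance both samples past every value not exceeding x
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        gap = max(gap, abs(i / len(a) - j / len(b)))
    return gap


def ks_two_sample_critical_5pct(n1, n2):
    """Large-sample five percent critical value for the two-sample test."""
    return 1.36 * math.sqrt((n1 + n2) / (n1 * n2))


print(round(ks_two_sample_critical_5pct(1000, 1000), 3))   # 0.061
```

Two sets of sample roots are judged to come from the same distribution when `ks_two_sample` returns a value below the critical value.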


4.3 A Comparison of the Mean, Variance and Skewness of the
Same Rank Order and Numerical Value for Four Different Runs


TABLE 8


MEAN, VARIANCE AND SKEWNESS OF THE 4th
ROOT POPULATION VALUE = 1.0


Run   Pop. Roots    Mean     Variance   Skewness
1     5, 2, 2, 1    0.947    0.018      0.171
2     5, 3, 1, 1    0.848    0.012      0.144
3     4, 3, 2, 1    0.951    0.018      0.205
4     7, 1, 1, 1    0.782    0.009      0.106


When the root is distinct from the other non-zero
roots, that is in runs 1 and 3, the mean and variance are
close enough to be considered equal; the only discrepancy
in this case is the skewness of the distribution. The distribution
of the root from the population with all the non-zero
roots distinct (run 3) is more positively skewed than the
other. In runs 2 and 4, where the root
considered is one of the equal roots, the mean and variance
in the case where there are two equal roots are greater, and
the distribution more positively skewed, than in the case where
the root is one of three equal roots.

The results given in this thesis show the marginal 
distributions of the individual roots of a matrix with a 
Wishart distribution in the central case. The density func- 
tions of these roots are still unknown, both in the central 
and non-central cases. 

Since these roots are used in various tests of significance,
it is necessary that the sampling distributions should
be known so that one would be aware of possible sampling
errors and the power of the tests could be determined.